# Replication files for "Divide to Conquer: Using Wedge Narratives To Influence Diaspora Communities" (Patrick Chester and Audrye Wong).

This repository contains the data and code needed to replicate the tables and figures in "Divide to Conquer: Using Wedge Narratives To Influence Diaspora Communities" (Patrick Chester and Audrye Wong, Security Studies).

## 1. REPLICATION SCRIPT

All analyses are conducted in R.

All figures and tables in the main article text and in the online appendix can be compiled by running the master file "01_code/replicate.R". This file will call the individual analysis scripts located in the same folder. Make sure to install all R libraries called at the beginning of "01_code/replicate.R" on your system.
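
A minimal sketch of that workflow, assuming the working directory is the repository root. The package names in `required` are placeholders, not the actual dependency list; replace them with the libraries loaded at the top of "01_code/replicate.R". The helper `missing_pkgs` is a hypothetical name introduced here for illustration.

```r
# Placeholder list (assumption): substitute the libraries loaded at the
# top of 01_code/replicate.R.
required <- c("stats", "utils")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install anything that is missing before running the replication.
to_install <- missing_pkgs(required)
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Run the master script from the repository root, if it is present.
if (file.exists("01_code/replicate.R")) {
  source("01_code/replicate.R")
}
```

Running the script from the repository root matters only if the scripts use relative paths such as "02_data" and "03_output", which is assumed here.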

For details or partial replication, open and run the individual source scripts. Each script runs independently, provided that you have first loaded the libraries and the analysis datasets specified in "01_code/replicate.R".

Figures are saved as .pdf and .png files in the subfolder "03_output/Graphs". Tables are saved as LaTeX or .docx files in the subfolder "03_output/Tables".

## 2. REPLICATION CODE

All replication code can be found in the "01_code" subfolder. The table below describes each script, its inputs, and its outputs.

| File | Inputs | Outputs | Description |
| ---- | ------ | ------- | ----------- |
| country_analysis.R | country_data.csv | country_similarity_regime.pdf, country_difference_regime.pdf, sa_similarity_regime.pdf, sa_similarity_regime_summed.pdf, country_se_robustness.pdf, table_country_concept_reg.tex, table_country_concept_reg_poly.tex, table_country_concept_reg_delib.tex, polity_table_se.tex | Produces analysis of WeChat media framing of countries in text. |
| diaspora_analysis.R | diaspora_data.csv | affiliation_attribute_ethnicity_crossection.pdf, affiliation_attribute_ethnicity_crossection_ungrouped.pdf, ethnic_se_robustness.pdf, table_ethnic.tex, table_ethnic_group.tex, ethnic_table_se.tex | Produces analysis of WeChat framing of diaspora. |
| misc_tables.R | dictionary_countryattributes.xlsx, dictionary_countrynames.xlsx, dictionary_diaspora.xlsx, dictionary_racism.xlsx, dictionary_socialgroups.xlsx, dictionary_violence.xlsx | diaspora_dict.tex, country_dict.tex, acct_metadata.tex | Produces the dictionary tables shown in the paper's appendix, as well as a table describing article counts by WeChat subscription account. |
| misc_figures.R | hate_crimes_complaints_sum.csv, sum_data.csv, term_data.csv | hate_crime_complaints.pdf, term_freq_time.pdf | Produces the hate crimes and term frequency figures shown in the article. |


## 3. REPLICATION DATA

### country_data.csv

- This file includes cosine similarity scores representing the associations between a dictionary of country names and a dictionary of country-level attributes, such as chaos, corruption, and a placebo (sports).
- These similarity scores were produced using skip-gram embeddings fit at the subcorpus level.
- The data used to produce these scores was scraped from WeChat using the [wechat_articles_spider](https://github.com/wnma3mz/wechat_articles_spider) Python package on GitHub.
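
For readers unfamiliar with the measure, the sketch below shows how a cosine similarity between two embedding vectors can be computed in R. The vectors are toy values chosen for illustration, and the function name `cosine_similarity` is hypothetical. This is not the authors' pipeline, and it does not show how scores are aggregated across dictionary terms.

```r
# Cosine similarity: dot product divided by the product of the vector norms.
cosine_similarity <- function(a, b) {
  stopifnot(length(a) == length(b))
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Toy vectors (assumption): stand-ins for a country-name embedding and an
# attribute embedding.
country_vec   <- c(1, 2, 3)
attribute_vec <- c(2, 4, 6)

# Parallel vectors give a similarity close to 1.
sim_parallel <- cosine_similarity(country_vec, attribute_vec)

# Orthogonal vectors give a similarity of 0.
sim_orthogonal <- cosine_similarity(c(1, 0), c(0, 1))
```

Values near 1 indicate that two terms appear in similar contexts, and values near 0 indicate little association.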

### diaspora_data.csv

- This file includes cosine similarity scores representing the associations between a dictionary of ethnic groups and a dictionary of two attributes: violence and racism.
- These similarity scores were produced using skip-gram embeddings fit at the subcorpus level.
- The data used to produce these scores was scraped from WeChat, as described above.

### Dictionary files

- Refers to the following files located in `02_data`:
    - dictionary_countryattributes.xlsx
    - dictionary_countrynames.xlsx
    - dictionary_diaspora.xlsx
    - dictionary_racism.xlsx
    - dictionary_socialgroups.xlsx
    - dictionary_violence.xlsx
- Each dictionary was produced using a combination of subject knowledge and assistance from the [`conclust`](https://github.com/pchest/conclust) algorithm.

### Other Data

- *hate_crimes_complaints_sum.csv:* Contains aggregated hate crime data obtained from the New York City Police Department.
- *sum_data.csv:* Contains aggregated article counts by WeChat official account.
- *term_data.csv:* Contains summarized term frequency counts by WeChat subscription account and time period.

## 4. SYSTEM INFORMATION

- All code was tested and ran without error on R version 4.4.0 under Pop!_OS 22.04 (Linux).
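
To compare a local setup against the tested one, a short check along these lines may help. It is a sketch only; the variable names are illustrative, and older or newer R versions may still work.

```r
# Version reported in this README as tested.
tested_version <- "4.4.0"

# Version of the R session currently running.
current_version <- getRversion()

if (current_version < tested_version) {
  warning("R ", as.character(current_version),
          " is older than the tested version ", tested_version)
}

# Print the platform and loaded packages for reference.
print(sessionInfo())
```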

## 5. GENERAL REMARKS

For questions or comments, do not hesitate to contact Patrick Chester (patrickjchester@gmail.com) or Audrye Wong (audryewong@gmail.com).

This repository is available through the APSR Dataverse.
